Preprocessing of Missing Values Using Robust Association Rules
Author
Abstract
1 Introduction
The missing values problem is an old one for analysis tasks [8][11]. The waste of data that can result from casewise deletion of missing values makes alternative approaches necessary. A current one is to try to determine these values [9]. However, techniques to guess the missing values must be efficient; otherwise the completion introduces noise. With the emergence of KDD (Knowledge Discovery in Databases) for industrial databases, where missing values are inevitable, this problem has become a priority task [6], one that also requires declarativity and interactivity during treatments. At present, treatments are often specific and internal to the methods, and do not offer such qualities. Consequently, the missing values problem is still a challenging task on the KDD research agenda [6]. We have proposed in [14] the RAR (Robust Association Rules) algorithm to correct the weakness of usual association rules algorithms [2] in mining databases with multiple missing values. The efficiency of this algorithm in quickly extracting all the associations contained in such a database allows it to be used for the missing values problem. That is what ...
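The data waste attributed above to casewise deletion can be illustrated with a small sketch. The toy table, the attribute names `a` to `d`, and the helper `casewise_deletion` are hypothetical and chosen only for illustration; they do not come from the paper.

```python
# Toy illustration (hypothetical data): casewise deletion drops every row
# that has at least one missing value, even if most of its cells are known.

def casewise_deletion(rows):
    """Keep only rows in which no attribute is missing (None)."""
    return [r for r in rows if all(v is not None for v in r.values())]

# Five rows, four attributes, one missing cell in each of the first four rows.
rows = [
    {"a": 1,    "b": None, "c": 3,    "d": 4},
    {"a": None, "b": 2,    "c": 3,    "d": 4},
    {"a": 1,    "b": 2,    "c": None, "d": 4},
    {"a": 1,    "b": 2,    "c": 3,    "d": None},
    {"a": 1,    "b": 2,    "c": 3,    "d": 4},
]

kept = casewise_deletion(rows)

total_cells = sum(len(r) for r in rows)
missing_cells = sum(v is None for r in rows for v in r.values())
missing_cell_rate = missing_cells / total_cells   # share of cells that are missing
discarded_row_rate = 1 - len(kept) / len(rows)    # share of rows thrown away

print(f"missing cells: {missing_cell_rate:.0%}, rows discarded: {discarded_row_rate:.0%}")
```

In this toy table a small share of missing cells leads to a much larger share of discarded rows, which is the kind of loss that motivates completing the values instead of deleting the cases.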
Similar resources
An Interactive and Understandable Method to Treat Missing Values:
Many analysis tasks have to deal with missing values, and some of them have developed specific and internal treatments to guess them. In this paper we present the use of a new method, called MVC (Missing Values Completion), for this question: MVC is based on data preprocessing which gives prominence to understandable associations and gives the user a central part. Such qualities will allow to use...
Mining rules from an incomplete dataset with a high missing rate
The problem of recovering missing values from a dataset has become an important research issue in the field of data mining and machine learning. In this thesis, we introduce an iterative missing-value completion method based on the RAR (Robust Association Rules) support values to extract useful association rules for inferring missing values in an iterative way. It consists of three phases. The ...
Using Association Rules to Make Rule-based Classifiers Robust
Rule-based classification systems have been widely used in real world applications because of the easy interpretability of rules. Many traditional rule-based classifiers prefer small rule sets to large rule sets, but small classifiers are sensitive to the missing values in unseen test data. In this paper, we present a larger classifier that is less sensitive to the missing values in unseen test...
A Novel Algorithm for Association Rule Mining from Data with Incomplete and Missing Values
Missing values and incomplete data are a natural phenomenon in real datasets. If association rules are mined from incomplete data without regard to missing values, mistaken rules are derived. In association rule mining, the treatment of missing values and incomplete data is important. This paper proposes a novel technique to mine association rules from data with missing values in large, voluminous databases. The p...
Algorithm for Missing Values Imputation in Categorical Data with Use of Association Rules
This paper presents a new algorithm for missing values imputation in categorical data. The algorithm is based on association rules and is presented in three variants. Experiments show better accuracy of missing values imputation using the new algorithm than using the most common attribute value.
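The contrast drawn above between rule-based imputation and the most-common-value baseline can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of the paper or any of its three variants: it uses only single-antecedent rules of the form `attribute=value -> target=value`, and the function name `impute_with_rules`, the `min_conf` threshold, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def impute_with_rules(rows, target, min_conf=0.6):
    """Fill missing (None) values of `target` in categorical rows.

    Assumption for this sketch: a rule is `attr=val -> target=t`, and its
    confidence is estimated on the rows where `target` is known. When no
    rule reaches `min_conf`, the most common target value is used instead.
    """
    complete = [r for r in rows if r[target] is not None]
    # Baseline: most common value of the target attribute.
    mode = Counter(r[target] for r in complete).most_common(1)[0][0]

    # counts[(attr, val)][t] = number of complete rows with attr=val and target=t
    counts = defaultdict(Counter)
    for r in complete:
        for attr, val in r.items():
            if attr != target and val is not None:
                counts[(attr, val)][r[target]] += 1

    result = []
    for r in rows:
        if r[target] is not None:
            result.append(dict(r))
            continue
        best_value, best_conf = mode, 0.0
        for attr, val in r.items():
            if attr == target or val is None:
                continue
            seen = counts.get((attr, val))
            if not seen:
                continue
            candidate, n = seen.most_common(1)[0]
            conf = n / sum(seen.values())  # confidence of attr=val -> target=candidate
            if conf >= min_conf and conf > best_conf:
                best_value, best_conf = candidate, conf
        result.append({**r, target: best_value})
    return result

# Hypothetical toy data: the target attribute is "play".
rows = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "no"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": "sunny", "play": None},   # a rule on outlook applies
    {"outlook": None,    "play": None},   # no rule applies: fallback to the mode
]

filled = impute_with_rules(rows, target="play")
print(filled[5]["play"], filled[6]["play"])
```

In the toy data the most common value of `play` is `yes`, so the baseline alone would fill both gaps with `yes`; the rule on `outlook=sunny` changes the first of the two completions, which is where the two approaches differ.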